Sigmoid Loss for Language Image Pre-Training